This paper analyzes the information content of medical images, with 3-D MRI images as an 
example, in terms of information entropy. The results of the analysis justify the use of Pixel 
Difference Coding for preserving all information contained in the original pictures, in other 
words for lossless coding. The experimental results also indicate that a compression ratio of 
CR=2:1 can be achieved under the lossless constraint. A practical implementation of Pixel 
Difference Coding that allows interactive retrieval of a local ROI (Region of Interest), while 
staying near the lower bound set by the information entropy, is discussed. 
I. INTRODUCTION 

This research aims to develop an information-preserving 
image coding method suitable for medical images, which 
usually do not allow any degree of image degradation in 
comparison with the source images [1]. 
JPEG, one of the well-accepted compression standards, 
exhibits an excellent compression gain, but does not 
ensure that the decoded result is an exact replica of 
the original. A variation of DPCM, or pixel difference 
coding, namely "Guided Scan Pixel Difference Coding", 
was studied. A major modification to the classical 
DPCM is the inclusion of scan information, in terms 
of the direction used to calculate the pixel difference 
and the step size. This additional information does not 
significantly increase the volume of coded data. It 
actually makes the DPCM extensible to 
multi-dimensional image data, in a similar 
manner as JPEG is extended to MPEG to handle movie 
presentation. Another important aspect of medical 
image coding, the ability to access the full 
details of a local image (often referred to as an ROI, Region 
of Interest), is also satisfied. Neither the location of an 
ROI nor the scanning method is restricted, so a local 
ROI can be selected anywhere in the original 
image. The flexibility in scanning makes it possible to 
use the same code for real-time movie presentation in the 
same way MPEG uses JPEG. This research considers 
a two-stage redundancy removal, one stage taking pixel 
differences and the other applying an entropic source coding 
similar to Huffman coding, to achieve a redundancy 
removal of approximately 4-5 bpp (bits per pixel). 
The sizes of medical images have increased with the 
advancement of various medical imaging modalities, 
typically MRI and X-ray CAT scanning. MRI can scan 
a whole body within a reasonable time of 30 minutes, 
and X-ray CAT has adopted helical scanning for a 
higher spatial resolution. In the case of 3-D scanning, 
the image size could be as large as 100 million pixels. 
Images of this size are usually stored on writable laser 
disks (CD-WORM), but this poses serious problems 
when the retrieval of such data is attempted through a 
communication channel in a wide area network (WAN) 
environment using telephone channels with a bit rate of 
64 kbits/sec. 
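
The scale of the retrieval problem can be illustrated with a rough calculation. The sketch below assumes 8 bits per pixel (the MR images analyzed later in this paper are 8-bit images) and ignores protocol overhead; the function name `transfer_seconds` is only illustrative.

```python
# Rough transfer time for an uncompressed 3-D image over a telephone channel.
# Assumption: 8 bits per pixel and no protocol overhead.
PIXELS = 100_000_000      # "as large as 100 million pixels"
BITS_PER_PIXEL = 8        # assumed pixel depth
CHANNEL_BPS = 64_000      # 64 kbits/sec channel


def transfer_seconds(pixels, bits_per_pixel, channel_bps, compression_ratio=1.0):
    """Seconds needed to send the image at the given compression ratio."""
    return pixels * bits_per_pixel / compression_ratio / channel_bps


raw = transfer_seconds(PIXELS, BITS_PER_PIXEL, CHANNEL_BPS)
halved = transfer_seconds(PIXELS, BITS_PER_PIXEL, CHANNEL_BPS, compression_ratio=2.0)
print(f"uncompressed: {raw / 3600:.2f} h, at CR=2:1: {halved / 3600:.2f} h")
```

Under these assumptions the full volume takes hours rather than minutes, which is why access to a small ROI matters as much as the overall compression ratio.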

The problems associated with the compression of medical 
images are twofold. The first requirement is that 
compression must be absolutely free of degradation. Regardless 
of whether a picture has been affected by measurement 
noise that occurs in the process of physical measurement 
and in image reconstruction, the source data 
acquired at an imager must be retained without any kind 
of loss. The second condition is the integrity of the objects 
contained in the picture data. Physicians examine 
very closely the image in an ROI (Region of Interest) or 
in a number of ROIs. A selected ROI needs to be immediately 
accessible, and it must be translated into a serial 
stream of compressed data for transmission via a 
communication channel. 



II. TECHNICAL BACKGROUND 

The method is best represented by the name Guided 
Scan Pixel Difference Coding, a variation of the DPCM 
widely used in compressing voice signals. It is well known 
that a PCM voice signal of 6-8 bits can be compressed 
down to a 3-bit DPCM. Even discounting the fact that 
DPCM uses a nonlinear quantization scale having a 
greater step size for a larger signal level, it is easy to 
remove 3 bits per code word, reducing the total volume 
down to half of the original. In comparison, JPEG 
achieves a compression ratio of about 10 and outperforms 
DPCM because it permits the encoded image to degrade 
somewhat from the original. DPCM is a non-degrading 
source coding if the quantization step size is fixed to the 
same linear scale as that already used in the source images. 
Furthermore, the encoding/decoding method is basically 
the simple operation of subtraction/summation. Further 
compression to remove the redundancy left over from 
DPCM is accomplished by an entropy coding such as 
the Huffman code. 
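
The subtraction/summation pair can be sketched in a few lines. This is a minimal illustration, not the implementation used in this work: it assumes 8-bit pixels, a left-to-right scan along each row, and NumPy; the names `dpcm_encode` and `dpcm_decode` are hypothetical.

```python
import numpy as np


def dpcm_encode(image):
    """Horizontal first differences; column 0 keeps the original values."""
    a = image.astype(np.int16)          # widen so differences cannot overflow
    d = a.copy()
    d[:, 1:] = a[:, 1:] - a[:, :-1]     # subtraction step
    return d


def dpcm_decode(diff):
    """A running sum along each row undoes the differencing exactly."""
    return np.cumsum(diff, axis=1).astype(np.uint8)


rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 6), dtype=np.uint8)
restored = dpcm_decode(dpcm_encode(img))
print("bit-exact reconstruction:", np.array_equal(restored, img))
```

Because the quantization scale is left untouched, the decoder recovers every pixel exactly; the differences only become compact once an entropy coder is applied to them.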

In order to satisfy the other aspect of medical image 
compression, accessibility to a ROI, additional informa- 
tion describing how an individual code representing a 
pixel difference is derived, must be attached at the ex- 
pense of compression gain. In the case of 2D images, 
there are four (up, down, left and right) or eight (if di- 
agonal differences are considered) possible directions by 
which a pixel difference can be calculated. If a viewer 
specifies an ROI by placing several guide marks around it, 
the length of each raster along the specified scan direction 
must be registered. 
Also, the step size used to calculate the pixel difference 
must be included if coarsely sampled images need to be 
transmitted for viewing an overall image at a reduced 
image quality. These additional pieces of information could 
be 4 bits for directions and about 3 bits to indicate the 
step size per picture. The information describing the scan 
lengths depends on how irregular the shape of the ROI 
is. Nevertheless, no significant increase in the total 
volume of coded words is expected. The term "Guided 
Scan" is meant to include the information as to how an 
image or an ROI is scanned. The way of scanning an image 
should be left for the image coding system to determine, 
so that the total volume of information can be minimized 
for individual selected ROIs. Another interesting aspect 
of the Guided Scan DPCM is the structural similarity 
observed in the forward or backward difference interpo- 
lation algorithm. Since the tree of high order differences, 
necessary for the interpolation, can be calculated easily 
from the first order differences coded by the Guided Scan 
DPCM, there is potential to artificially increase the spa- 
tial resolution of the source image without distorting the 
image. 
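
As an illustration of how scan information might travel with the coded differences, the sketch below codes a rectangular ROI with a sampling step. Everything here is an assumption made for the example: the `ScanGuide` record, its fields, the rectangular ROI shape, and the fixed left-to-right direction are hypothetical and are not the format used in this work.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ScanGuide:
    """Hypothetical side information sent along with the differences."""
    top: int
    left: int
    height: int
    width: int
    step: int      # sampling step applied before differencing


def encode_roi(image, guide):
    """Subsample the ROI by `step`, then take left-to-right differences."""
    roi = image[guide.top:guide.top + guide.height:guide.step,
                guide.left:guide.left + guide.width:guide.step].astype(np.int16)
    diff = roi.copy()
    diff[:, 1:] = roi[:, 1:] - roi[:, :-1]
    return diff


def decode_roi(diff):
    """Recover the (subsampled) ROI pixels by summation along each row."""
    return np.cumsum(diff, axis=1).astype(np.uint8)


img = np.arange(64, dtype=np.uint8).reshape(8, 8)
guide = ScanGuide(top=2, left=1, height=4, width=6, step=2)
roi_back = decode_roi(encode_roi(img, guide))
```

With `step=1` the ROI is recovered at full resolution; a larger step gives the coarse overview mentioned above at a fraction of the data.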



If eight bits are used to code an image and the probability 
for any number between 0 and 255 to occur is the 
same, the information entropy is 8 bits. If only one value, say 
128, occurs at all times, the entropy is zero. Adjacent pixels 
in a digitized image are highly correlated. If two adjacent 
pixels are considered, the probability for the first pixel x 
to take a value i, that for the second pixel y to take j, 
and the joint probability for the pixel x to take a value i 
and the pixel y to take j are given respectively as follows: 



$$p_i = \frac{N_i}{N}, \qquad p_j = \frac{N_j}{N} \qquad \text{and} \qquad p_{ij} = \frac{N_{ij}}{N} \tag{2}$$

The entropies based on the joint probability and the 
conditional probabilities are then obtained as follows: 



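$$\begin{aligned}
H(x,y) &= -\sum_{i,j} p_{ij} \log_2 p_{ij} \\
H(x|y) &= -\sum_{i,j} p_{ij} \log_2 \frac{p_{ij}}{p_j} \\
H(y|x) &= -\sum_{i,j} p_{ij} \log_2 \frac{p_{ij}}{p_i}
\end{aligned} \tag{3}$$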



We also know that H(x,y) = H(x) + H(y|x) = H(y) + 
H(x|y). Theoretically, 

$$H(y|x) - H(y) \le 0 \tag{4}$$

so the entropy calculated for the second pixel y knowing 
the occurrence of the first pixel x is no larger than the 
entropy calculated from y alone. The uncertainty 
coefficient of y 



$$U(y|x) = \frac{H(y) - H(y|x)}{H(y)} \tag{5}$$



is indicative of the dependency of y on x. The 
symmetrical uncertainty: 

$$U(x,y) = 2\,\frac{H(x) + H(y) - H(x,y)}{H(x) + H(y)} \tag{6}$$

yields 0 if x and y are completely independent and 1 if 
they are completely dependent. 
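
The joint, conditional and uncertainty measures can be estimated from a table of pair counts. The sketch below is one possible way to do so with NumPy, assuming integer-valued samples in the range 0 to `levels - 1`; the function name `pair_entropies` and the dictionary keys are illustrative choices.

```python
import numpy as np


def pair_entropies(x, y, levels):
    """Entropies in bits for paired samples x, y with values in [0, levels)."""
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    counts = np.zeros((levels, levels))
    np.add.at(counts, (x, y), 1)            # N_ij: how often (i, j) occurs
    p_ij = counts / counts.sum()
    p_i = p_ij.sum(axis=1)                  # marginal of x
    p_j = p_ij.sum(axis=0)                  # marginal of y

    def h(p):
        p = p[p > 0]                        # 0 * log(0) is taken as 0
        return float(-(p * np.log2(p)).sum())

    hx, hy, hxy = h(p_i), h(p_j), h(p_ij)
    return {
        "H(x)": hx, "H(y)": hy, "H(x,y)": hxy,
        "H(y|x)": hxy - hx, "H(x|y)": hxy - hy,
        "U(y|x)": (hy - (hxy - hx)) / hy,
        "U(x,y)": 2 * (hx + hy - hxy) / (hx + hy),
    }


dependent = pair_entropies([0, 1, 2, 3], [0, 1, 2, 3], levels=4)
independent = pair_entropies([0, 0, 1, 1], [0, 1, 0, 1], levels=2)
print(dependent["U(x,y)"], independent["U(x,y)"])
```

For an image, `x` and `y` would be an array and its one-pixel-shifted copy, for example `a[:, :-1]` and `a[:, 1:]`. The conditional entropies are obtained here through the chain rule rather than by summing the ratio form directly; the two are algebraically equivalent.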



III. INFORMATION ENTROPY 



The information entropy is defined by: 



$$H = -\sum_{i=1}^{I} p_i \log_2 p_i \tag{1}$$

where p_i is the probability associated with the 
occurrence of a value i when the total number of values used is I. 
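
A minimal sketch of Eq. 1 applied to a histogram is given below, assuming non-negative integer pixel values and NumPy; `entropy_bits` is an illustrative name. The two test inputs reproduce the limiting cases of a uniformly used 8-bit range and a single constant value.

```python
import numpy as np


def entropy_bits(values, levels=256):
    """Zeroth-order entropy in bits per sample, estimated from a histogram."""
    counts = np.bincount(np.asarray(values).ravel(), minlength=levels)
    p = counts[counts > 0] / counts.sum()   # skip empty bins: 0 * log(0) = 0
    return float(-(p * np.log2(p)).sum())


uniform = np.arange(256)          # every value 0..255 occurs equally often
constant = np.full(1000, 128)     # only the value 128 occurs

h_uniform = entropy_bits(uniform)
h_constant = entropy_bits(constant)
```

The estimate treats pixels as independent draws, so it ignores any correlation between neighbours.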



IV. ENTROPY CALCULATED FROM 
STRETCHED EXPONENTIAL PDF 

When the probability density function (PDF), obtained 
from a histogram of pixel intensities, is well approximated 
by the stretched exponential probability density 
function: 

$$p(x) = K e^{-\left(|x|/\alpha\right)^{\beta}} \quad \text{with} \quad K = \frac{\beta}{2\alpha\,\Gamma(1/\beta)} \tag{7}$$
the theoretical entropy can be calculated from: 

$$H = \frac{2K\alpha}{\beta}\left[\ln\!\left(\frac{1}{K}\right)\Gamma\!\left(\frac{1}{\beta}\right) + \Gamma\!\left(\frac{1}{\beta} + 1\right)\right] \tag{8}$$
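
As a cross-check on the closed form, the entropy integral can be worked out directly. The derivation below is a sketch that assumes the density has the form p(x) = K exp(-(|x|/α)^β) with K = β/(2αΓ(1/β)). It uses the natural logarithm, so the result is in nats (divide by ln 2 for bits), and it is the entropy of the continuous density, which only approximates the entropy of a histogram with unit-width bins.

```latex
\begin{align*}
H &= -\int_{-\infty}^{\infty} p(x)\,\ln p(x)\,dx
   = -\ln K + K\int_{-\infty}^{\infty}
     \left(\frac{|x|}{\alpha}\right)^{\beta} e^{-(|x|/\alpha)^{\beta}}\,dx \\
  &= -\ln K + \frac{2K\alpha}{\beta}\,\Gamma\!\left(\frac{1}{\beta}+1\right)
     \qquad \text{(substituting } t = (x/\alpha)^{\beta} \text{ for } x > 0\text{)} \\
  &= \ln\frac{1}{K} + \frac{1}{\beta}
     \qquad \text{since } \frac{2K\alpha}{\beta}\,\Gamma\!\left(\tfrac{1}{\beta}\right) = 1
     \text{ and } \Gamma\!\left(\tfrac{1}{\beta}+1\right)
     = \tfrac{1}{\beta}\,\Gamma\!\left(\tfrac{1}{\beta}\right).
\end{align*}
```

Under these assumptions the entropy depends on α only through ln(1/K), so for fixed β it grows with the logarithm of α.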

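After evaluating the absolute mean μ and standard 
deviation σ, the parameters α and β are extracted from 
the equation: 

$$F(\beta) = \frac{\mu^2}{\sigma^2 + \mu^2} \tag{9}$$

where the function F is defined by: 

$$F(\beta) = \frac{\Gamma^2(2/\beta)}{\Gamma(1/\beta)\,\Gamma(3/\beta)} \tag{10}$$

and α is defined by: 

$$\alpha = \sqrt{\frac{(\sigma^2 + \mu^2)\,\Gamma(1/\beta)}{\Gamma(3/\beta)}} \tag{11}$$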
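
The parameter extraction can be sketched as a one-dimensional root search. The code below is an illustration under stated assumptions, not the procedure used in this work: it takes F(β) to be the moment ratio Γ(2/β)²/(Γ(1/β)Γ(3/β)) of the stretched exponential family, takes μ and σ to be the mean and standard deviation of |x|, and solves for β by bisection over an arbitrarily chosen bracket [0.2, 10]. The function names are hypothetical.

```python
import math


def moment_ratio(beta):
    """F(beta) = Gamma(2/beta)^2 / (Gamma(1/beta) * Gamma(3/beta))."""
    return math.gamma(2 / beta) ** 2 / (math.gamma(1 / beta) * math.gamma(3 / beta))


def fit_stretched_exponential(abs_mean, abs_std, lo=0.2, hi=10.0, iters=80):
    """Moment-matching estimate of (alpha, beta) from the mean and std of |x|."""
    second_moment = abs_std ** 2 + abs_mean ** 2
    target = abs_mean ** 2 / second_moment
    for _ in range(iters):                  # bisection: F increases with beta
        mid = 0.5 * (lo + hi)
        if moment_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    alpha = math.sqrt(second_moment * math.gamma(1 / beta) / math.gamma(3 / beta))
    return alpha, beta


# A Laplacian density (beta = 1) with alpha = 2 has E|x| = 2 and E[x^2] = 8,
# so the standard deviation of |x| is sqrt(8 - 4) = 2.
alpha_hat, beta_hat = fit_stretched_exponential(abs_mean=2.0, abs_std=2.0)
```

The bisection only converges to a meaningful value when the target ratio lies between F at the two ends of the bracket, roughly 0.06 to 0.74 for the bracket chosen here.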



Fig. 1 depicts the variation of the entropy H as a 
function of α in the interval [0.1, 5] for a given β picked 
in the interval [0.8, 1.5]. 



V. PRELIMINARY STUDIES ON 
INFORMATION ENTROPY 

A 3-D MRI brain section image consisting of 64 
slices was analyzed in an attempt to find the lower 
bound of the information entropy for medical images. 
The lower bound is a definite measure that tells exactly 
how many bits can be removed per pixel as redundancy. 
One slice of the 3-D image is shown in Fig. 2. Two 
different approaches are taken to calculate the entropies 
for the image shown in Fig. 2. 

1. Use Eq. 1 which does not consider the dependency 
between adjacent pixels for n-th order difference 
images (n = 0,...,10). 

2. Use Eq. 3 which takes the dependency of a pixel on 
its immediate neighbours into account (equivalent 
to a first order Markov model). 

The first approach finds a crude information entropy, 
if it is applied to the original picture since the redun- 
dancy due to pixel-to-pixel correlation is still intact. In 
order to find a more accurate information entropy, it is 
necessary to remove the redundancy by some means that 
allows the recovery of the original image. A difference 
image is introduced for this purpose. When a 2-D image 
is denoted by A and z^{-1} represents the one-step shift 
operator towards the positive side of the horizontal axis (or 
vertical or diagonal axis), the first difference is: 

$$D_1 = A - z^{-1}A = (1 - z^{-1})A \tag{12}$$

The second difference is given by 

$$D_2 = (1 - z^{-1})D_1 = (1 - z^{-1})^2 A \tag{13}$$

Thus the n-th difference is given by 

$$D_n = (1 - z^{-1})^n A \tag{14}$$

In order to make reconstruction possible, the first 
column of the first difference must retain the first column 
of the original if a horizontal shift is used. The second 
difference must retain the first column of the original, 
and the second column of the first difference in its second 
column. Additional DC restoration columns must 
be progressively added when a new difference image is 
created. Successive applications of this difference 
operator remove the pixel-to-pixel correlation, and the resulting 
images become gradually more random. Table 1 shows 
how the entropy associated with such a difference image 
varies when n is increased. 



Difference image    Entropy (bits/pixel)
n=0                  5.615405
n=1                  4.861321
n=2                  5.448454
n=3                  6.240144
n=4                  7.103492
n=5                  7.997103
n=6                  8.908238
n=7                  9.813683
n=8                 10.725728
n=9                 11.610693
n=10                12.468448

TABLE I: Entropies (bits/pixel) of Difference Images 
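
The n-th difference image with its DC restoration columns, and the matching reconstruction, can be sketched as follows. This is a minimal illustration assuming a horizontal shift and NumPy; the function names are hypothetical and the random test image merely stands in for an MRI slice.

```python
import numpy as np


def nth_difference(image, n):
    """Apply the horizontal difference n times, keeping restoration columns."""
    d = image.astype(np.int64)
    for k in range(n):
        nxt = d.copy()
        # Columns 0..k-1 already hold restoration data and are left alone;
        # column k keeps the leading column of the current image;
        # the remaining columns are replaced by differences.
        nxt[:, k + 1:] = d[:, k + 1:] - d[:, k:-1]
        d = nxt
    return d


def inverse_nth_difference(diff, n):
    """Undo nth_difference by running sums, last difference first."""
    d = diff.copy()
    for k in reversed(range(n)):
        d[:, k:] = np.cumsum(d[:, k:], axis=1)
    return d


rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(5, 9))
round_trips = [
    np.array_equal(inverse_nth_difference(nth_difference(img, n), n), img)
    for n in range(4)
]
```

Feeding each `nth_difference(img, n)` to a histogram-based entropy estimate would produce a table of the same kind as Table I, although the numbers depend on the image.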



2-D auto-correlation functions for the difference 
images of n=0, 1 and 2 are shown in Fig. 3. The top 
figure is the 2-D auto-correlation of the original image. 
Comparing the middle (n=1) and the bottom (n=2), it is 
observed that the central peak for n=2 is sharper (less 
correlated) than that for n=1. As for the entropies 
calculated, the entropy drops sharply to its minimum at 
the very first difference operation and then increases 
as n increases. Successive difference operations seem to 
decorrelate the image and make it more random, but the 
entropy increases monotonically after the first difference. 
This phenomenon can be explained from the frequency 
response of the n-th difference operation, described by the 
transfer function of a high-pass filter: 

$$G(z) = (1 - z^{-1})^n \tag{15}$$
The magnitude response is given by: 

$$|G(e^{j\omega})| = 2^n \sin^n\!\left(\frac{\omega}{2}\right) \quad \text{where } 0 < \omega < \pi \tag{16}$$

and the bandwidth becomes wider as n increases. The 
power of the n-th difference image is therefore greater than 
that of the (n-1)-th difference image, and so is the variance. 
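
The growth in power can be made concrete for an idealized input. The sketch below assumes a white (uncorrelated) input, which real images only approach after the first difference, and averages the squared magnitude response over 0 < ω < π. For such an input the power gain of the n-th difference equals the central binomial coefficient C(2n, n), which the code uses as a check. The function name is illustrative.

```python
import math

import numpy as np


def white_noise_power_gain(n, samples=4096):
    """Mean of |G|^2 = (2 sin(w/2))^(2n) over 0 < w < pi (midpoint rule)."""
    w = (np.arange(samples) + 0.5) * np.pi / samples
    return float(np.mean((2 * np.sin(w / 2)) ** (2 * n)))


gains = {n: white_noise_power_gain(n) for n in range(1, 6)}
for n, g in gains.items():
    print(n, round(g, 6), math.comb(2 * n, n))
```

Each additional difference multiplies the power of a white input by a factor between 2 and 4, which is consistent with an entropy that keeps rising once the pixel-to-pixel correlation has been removed.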

The second approach using Eq. 3 considers the depen- 
dency between adjacent pixels in calculating information 
entropies so that it is no longer necessary to decorrelate 
images. Since the first difference achieves the minimum 
entropy value, Eq. 3 is applied to the image of the first 
difference. Table 2 summarizes the results. The entropy 
H(x,y) based on the joint probability p_ij is divided by 
2 to translate it into the entropy per pixel. This value is 
significantly smaller than H(x) or H(y). 

Another interesting result is found by fitting the 
stretched exponential PDF to the histogram of the 
original image shown in Fig. 4. The entropy for the large 
peak located in the lower gray-scale range below 50 is 
calculated to be 2.749977 without including the range above 50. 
This is nearly one half of the total entropy found for the 
original image. A whole medical image usually 
contains a significant portion of dark background. 



Entropy      Value (bits)
H(x)         4.861343
H(y)         4.861343
H(x,y)       9.159570
H(x,y)/2     4.579785
H(y|x)       4.298226
H(x|y)       4.298226

TABLE II: Information Entropies Considering Adjacent 
Pixel-to-pixel Correlation 
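
The entries of Table II are tied together by the chain rule H(x,y) = H(x) + H(y|x), which gives a quick consistency check. The short calculation below uses only the tabulated values; the 8-bit source depth used for the compression ratios is taken from the image description in the Discussion, and the variable names are illustrative.

```python
# Values copied from Table II (bits).
H_x = 4.861343
H_xy = 9.159570

H_y_given_x = H_xy - H_x                    # chain rule
per_pixel = H_xy / 2                        # joint entropy per pixel
U_y_given_x = (H_x - H_y_given_x) / H_x     # H(y) equals H(x) in Table II

SOURCE_BITS = 8                             # 8-bit MR images
cr_joint = SOURCE_BITS / per_pixel
cr_conditional = SOURCE_BITS / H_y_given_x

print(f"H(y|x) = {H_y_given_x:.6f}, H(x,y)/2 = {per_pixel:.6f}")
print(f"U(y|x) = {U_y_given_x:.4f}, CR bounds = {cr_joint:.2f} to {cr_conditional:.2f}")
```

The chain-rule value agrees with the tabulated H(y|x) to within rounding of the last digit, and both entropy figures imply a compression ratio somewhat below 2:1.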



VI. DISCUSSION 

Reviewing the experimental results presented in the 
previous sections, some conclusive remarks can be made 
to determine the strategy for information-preserving 
source coding. Information-preserving coding, or lossless 
coding, generally means that the picture received is 
the picture sent in terms of the bit structure of the 
image, not in terms of the visual impression of the 
image before and after image compression/transmission. 
In this narrow sense of lossless coding, altering pixels 
is prohibited. No alterations in quantization are 
permitted. With these strict constraints, the only 
possible source of redundancy that could be removed is 
the pixel-to-pixel correlation. Pixel difference 
coding (DPCM) alone brings the information entropy 
down to its near minimum. As seen in the entropy 
H(x,y), the result of the pixel difference coding can be 
further trimmed, but not to a large extent. The MR 
images analyzed are all 8-bit images. According to 
the sample calculations shown in this paper and other 
tested results, the bare minimum information entropy per 
pixel is slightly greater than 4 bits/pixel. An optimistic 
compression ratio is therefore CR=2:1, as long 
as a near-optimum compression algorithm, typically 
Huffman coding, is used. By blending the Huffman code 
with codes that account for state transitions occurring 
frequently within a near-zero range of the pixel 
difference scale, a small improvement in compression 
ratio can be made. For example, a run of successive 
zeros up to a length of 10 can be treated as a code word 
if the frequencies of such occurrences are sufficiently high. 
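
How closely a Huffman code approaches the entropy can be illustrated on a toy set of pixel differences. The sketch below is a generic Huffman construction using Python's standard library, not the coder used in this work; the symbol counts are made up so that the probabilities are powers of two, a case in which the average code length meets the entropy exactly.

```python
import heapq
import math
from collections import Counter


def huffman_lengths(freqs):
    """Code length per symbol for a Huffman code built from symbol counts."""
    if len(freqs) == 1:
        return {next(iter(freqs)): 1}
    # Heap items: (count, unique tie-breaker, symbols in this subtree).
    heap = [(count, i, [sym]) for i, (sym, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:                 # merged symbols gain one bit
            lengths[sym] += 1
        heapq.heappush(heap, (c1 + c2, tie, s1 + s2))
        tie += 1
    return lengths


# Made-up pixel differences concentrated around zero.
diffs = [0] * 8 + [1] * 4 + [-1] * 2 + [2] * 1 + [-2] * 1
freqs = Counter(diffs)
total = sum(freqs.values())
lengths = huffman_lengths(freqs)
avg_len = sum(freqs[s] * lengths[s] for s in freqs) / total
entropy = -sum(c / total * math.log2(c / total) for c in freqs.values())
print(f"average length {avg_len} bits, entropy {entropy} bits")
```

For real difference histograms the probabilities are not powers of two, so the average length exceeds the entropy by up to one bit per symbol; grouping runs of zeros into single code words is one way to narrow that gap.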

Further improvement of CR requires removing the 
constraint on quantization. As used in DPCM for voice signal 
coding, there are several methods to set up a quantization 
scale which minimizes the information entropy. If 
the methods of transform coding are allowed, CR can be 
improved drastically. Approaches that control the quality 
of medical images could also be considered, for example by 
preserving texture, the appearance of speckles, or repeated 
basic patterns as fractal images do. If one coding method 
can assure the fidelity of a certain image attribute, it might 
be considered a better coding scheme. The medical community 
may be prepared to accept it as a replacement for 
the totally reconstructible information-preserving coding. 

If some loss is permitted in medical image coding, the 
n-th order difference images discussed in this paper have 
another potential application for image coding. Recalling 
that the power spectrum S(ω_x, ω_y) of an n-th order 
difference image D_n(x,y) relates to its auto-correlation 
R(x,y) through the 2-D Fourier transform, 

$$S(\omega_x, \omega_y) = \mathcal{F}\{R(x,y)\} = \{\mathcal{F} D_n(x,y)\}\{\mathcal{F} D_n(x,y)\}^{*} \tag{17}$$

it is apparent that the magnitude information of 
D_n(x,y) is contained in the highly concentrated peak of 
R(x,y). Since the phase information is lost in calculating 
R(x,y), it is necessary to preserve the phase of D_n(x,y). 
The auto-correlation of the phase ∠F{D_n(x,y)} shown in Fig. 5 
indicates that the magnitude of phase is also highly 
concentrated at zero of the 2-D phase auto-correlation. 
The n-th order difference image produces a highly 
concentrated auto-correlation both in magnitude and 
in phase. By preserving the profile of the peak and 
discarding the rest, it seems possible to achieve a high 
compression gain. It is however known that image 
degradation is usually worsened by inaccurate phase 
estimation. 
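
The relation between the power spectrum and the auto-correlation can be checked numerically. The sketch below assumes the circular (periodic) auto-correlation that the discrete Fourier transform implies, and uses NumPy with a small random array in place of a difference image; the function names are illustrative.

```python
import numpy as np


def autocorrelation_via_spectrum(d):
    """Circular auto-correlation as the inverse FFT of the power spectrum."""
    spectrum = np.fft.fft2(d)
    power = spectrum * np.conj(spectrum)    # {F D}{F D}*: phase is discarded
    return np.real(np.fft.ifft2(power))


def autocorrelation_direct(d):
    """Same quantity from the definition, using circular shifts."""
    h, w = d.shape
    out = np.empty((h, w))
    for dy in range(h):
        for dx in range(w):
            shifted = np.roll(d, shift=(-dy, -dx), axis=(0, 1))
            out[dy, dx] = np.sum(d * shifted)
    return out


rng = np.random.default_rng(2)
d = rng.integers(-8, 9, size=(5, 6)).astype(float)
r_fft = autocorrelation_via_spectrum(d)
r_direct = autocorrelation_direct(d)
print("max deviation:", float(np.max(np.abs(r_fft - r_direct))))
```

Because the product with the complex conjugate removes the phase, many different images share the same auto-correlation, which is why the phase has to be kept separately.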
Fig. 1: Theoretical entropy calculated for stretched 
exponential probability density functions. The graphs 
show the entropy versus α with β as a parameter. 

Fig. 2: A Slice of a 3-D MRI Head Image (256x256) 

Fig. 3: 2-D Auto-correlation, original, 1st difference, 2nd 
difference image and phase image of the original 
(from top to bottom). Top vertical scale is in units 
of 10^7, middle is in units of 10^5 whereas bottom 
part is in 10^6 units. 

Fig. 4: Histograms for the original, 1st difference and 2nd 
difference images. 

Fig. 5: Auto-correlation of the phase image. Vertical scale 
is in units of 10^4. 